
Improved local model support #166

Open

bhargav1000 wants to merge 6 commits into huggingface:main from bhargav1000:improved-local-model-support

Conversation

@bhargav1000

Summary

Adds improved, feature-gated support for local models served from OpenAI-compatible endpoints:

  • ollama/<model>
  • vllm/<model>
  • llamacpp/<model>
  • local://<model>

This wires local routing through LiteLLM and surfaces it in the CLI /model command, backend validation and model catalogs, and the web model selector. Local model UI/API support is gated behind ENABLE_LOCAL_MODELS=true; the local inference servers themselves must already be running.
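As a rough illustration only (the helper and env var names below are assumptions, not the PR's actual code), the feature flag and prefix resolution could fit together roughly like this, with endpoint base URLs kept in server-controlled env vars:

```python
import os

# Illustrative prefix-to-env-var mapping; the env var names are assumptions,
# not necessarily the ones this PR uses.
LOCAL_PREFIXES = {
    "ollama/": "OLLAMA_API_BASE",
    "vllm/": "VLLM_API_BASE",
    "llamacpp/": "LLAMACPP_API_BASE",
    "local://": "LOCAL_API_BASE",
}


def resolve_local_model(model_id: str) -> tuple[str, str] | None:
    """Return (bare_model_name, api_base) for a local model ID, else None."""
    if os.getenv("ENABLE_LOCAL_MODELS", "").lower() != "true":
        return None  # feature flag off: local IDs are not exposed at all
    for prefix, env_var in LOCAL_PREFIXES.items():
        if model_id.startswith(prefix):
            api_base = os.getenv(env_var)
            if not api_base:
                raise ValueError(
                    f"{env_var} must be set on the server to use {model_id!r}"
                )
            return model_id[len(prefix):], api_base
    return None
```

The resolved (model, api_base) pair would then be handed to LiteLLM for the actual completion call; keeping the base URLs in server-side env vars lines up with the follow-up note that local endpoint configuration stays server-controlled.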

Review Follow-up

Addressed review feedback:

  • Warn when reasoning_effort is passed to local models in non-strict mode (see the sketch after this list).
  • Keep the strict-mode rejection of reasoning_effort for local models explicit.
  • Removed redundant whitespace validation.
  • Removed fragile sys.path injection in local model validation tests.
  • Added explicit feature-flag coverage for custom local IDs.
  • Documented that local endpoint env vars must remain server-controlled.
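A minimal sketch of the warn-vs-reject behavior described above; the function name and parameters are hypothetical, not the actual code in agent/core/llm_params.py:

```python
import warnings


def filter_local_params(params: dict, is_local: bool, strict: bool) -> dict:
    """Hypothetical sketch: drop or reject reasoning_effort for local models."""
    if not (is_local and "reasoning_effort" in params):
        return params
    if strict:
        # Strict mode: fail loudly instead of silently altering the request.
        raise ValueError("reasoning_effort is not supported for local models")
    # Non-strict mode: warn and strip the unsupported parameter.
    warnings.warn("reasoning_effort is ignored by local models; dropping it")
    return {k: v for k, v in params.items() if k != "reasoning_effort"}
```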

Validation

  • uv --cache-dir /tmp/uv-cache run pytest - 178 passed, 3 skipped
  • npm run build from frontend/ - passed, with existing Vite chunk-size warning
  • python -m py_compile agent/core/llm_params.py backend/model_catalog.py - passed

@fglogan

fglogan commented May 3, 2026

closed per maintainer request

